Support concatenated gzip (mgzip/pigz) layers in ztoc#1871

Open
prafgup wants to merge 1 commit into
awslabs:mainfrom
prafgup:prafulg/multiple-gzip-members
Contributor

@prafgup prafgup commented Feb 23, 2026

Issue #, if available:
Fixes #1834. The C-level zinfo code only processed the first gzip member, causing all ztoc entries to map to span 0 for mgzip-compressed layers. This adds handling for multiple concatenated gzip members in all three functions: index generation, file-based extraction, and buffer-based extraction.
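To illustrate the failure mode described above, here is a minimal Go sketch of what "concatenated gzip members" means for a reader. It is not the PR's C code: it uses Go's standard `compress/gzip` package, and the helper names (`gzipMember`, `readFirstMember`, `readAllMembers`) are hypothetical. A reader that stops at the first gzip trailer sees only the first member, while a reader that keeps parsing member headers recovers the whole stream.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMember compresses data into one complete gzip member
// (header, deflate stream, trailer).
func gzipMember(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readFirstMember stops at the first gzip trailer, mimicking a reader that
// treats the end of the first member as the end of the layer.
func readFirstMember(layer []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(layer))
	if err != nil {
		return nil, err
	}
	zr.Multistream(false)
	return io.ReadAll(zr)
}

// readAllMembers decodes every member in turn and counts them.
func readAllMembers(layer []byte) ([]byte, int, error) {
	// bytes.Reader implements io.ByteReader, so gzip does not read past a member.
	br := bytes.NewReader(layer)
	zr, err := gzip.NewReader(br)
	if err != nil {
		return nil, 0, err
	}
	var out bytes.Buffer
	members := 0
	for {
		zr.Multistream(false)
		if _, err := io.Copy(&out, zr); err != nil {
			return nil, members, err
		}
		members++
		// Reset parses the next member header; io.EOF means the input is exhausted.
		err := zr.Reset(br)
		if err == io.EOF {
			return out.Bytes(), members, nil
		}
		if err != nil {
			return nil, members, err
		}
	}
}

func main() {
	a, err := gzipMember([]byte("hello, "))
	if err != nil {
		panic(err)
	}
	b, err := gzipMember([]byte("world"))
	if err != nil {
		panic(err)
	}
	layer := append(a, b...) // two members back to back, still a valid gzip file

	first, err := readFirstMember(layer)
	if err != nil {
		panic(err)
	}
	all, n, err := readAllMembers(layer)
	if err != nil {
		panic(err)
	}
	fmt.Printf("first member only: %q\n", first)
	fmt.Printf("all %d members: %q\n", n, all)
}
```

Any index built while reading in the first style would know nothing about data past the first member, which matches the symptom reported in the issue.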

Description of changes:

Testing performed:

  • Added unit test -
lima sudo go test -v -count=1 -run TestDecompressWithPigz ./ztoc/...
=== RUN   TestDecompressWithPigz
=== RUN   TestDecompressWithPigz/span_size_64KB
=== RUN   TestDecompressWithPigz/span_size_256KB
=== RUN   TestDecompressWithPigz/span_size_1MB
--- PASS: TestDecompressWithPigz (0.53s)
    --- PASS: TestDecompressWithPigz/span_size_64KB (0.17s)
    --- PASS: TestDecompressWithPigz/span_size_256KB (0.17s)
    --- PASS: TestDecompressWithPigz/span_size_1MB (0.18s)
PASS
ok      github.com/awslabs/soci-snapshotter/ztoc        0.540s
PASS
ok      github.com/awslabs/soci-snapshotter/ztoc/compression    0.002s [no tests to run]
?       github.com/awslabs/soci-snapshotter/ztoc/compression/fbs/zinfo  [no test files]
?       github.com/awslabs/soci-snapshotter/ztoc/fbs/ztoc       [no test files]
lima sudo GO_TEST_FLAGS="-run TestSociZtocWithPigzLayers -count=1" make integration
cd cmd/ ; GO111MODULE=auto go build -o /Volumes/git/prafulg/soci-snapshotter/out/soci-snapshotter-grpc  -ldflags '-X github.com/awslabs/soci-snapshotter/version.Version=7d69a740.m -X github.com/awslabs/soci-snapshotter/version.Revision=7d69a7404b82eaeca6ccf229a1d4566aa920c8be.m  -s -w '  ./soci-snapshotter-grpc
cd cmd/ ; GO111MODULE=auto go build -o /Volumes/git/prafulg/soci-snapshotter/out/soci  -ldflags '-X github.com/awslabs/soci-snapshotter/version.Version=7d69a740.m -X github.com/awslabs/soci-snapshotter/version.Revision=7d69a7404b82eaeca6ccf229a1d4566aa920c8be.m  -s -w '  ./soci
integration
SOCI_SNAPSHOTTER_PROJECT_ROOT=/Volumes/git/prafulg/soci-snapshotter
=== RUN   TestSociZtocWithPigzLayers
=== PAUSE TestSociZtocWithPigzLayers
=== CONT  TestSociZtocWithPigzLayers
--- PASS: TestSociZtocWithPigzLayers (3.41s)
PASS
ok      github.com/awslabs/soci-snapshotter/integration 116.912
{
  "version": "0.9",
  "build_tool": "AWS SOCI CLI v0.2",
  "size": 7262136,
  "span_size": 4194304,
  "num_spans": 118,
  "num_files": 15553,
  "num_multi_span_files": 80,
  "files": [
    {
      "filename": "app/var/data/conda-env/x86_64-conda-linux-gnu",
      "offset": 512,
      "size": 0,
      "type": "dir",
      "start_span": 0,
      "end_span": 0
    },
    {
      "filename": "app/var/data/conda-env/x86_64-conda-linux-gnu/bin",
      "offset": 1024,
      "size": 0,
      "type": "dir",
      "start_span": 0,
      "end_span": 0
    },
....
    {
      "filename": "app/var/data/conda-env/lib/libopenblasp-r0.3.30.so",
      "offset": 97394176,
      "size": 41424320,
      "type": "reg",
      "start_span": 23,
      "end_span": 32
    }
....
    {
      "filename": "app/var/data/conda-env/man/man1/bzegrep.1",
      "offset": 498103296,
      "size": 18,
      "type": "reg",
      "start_span": 117,
      "end_span": 117
    },
    {
      "filename": "app/var/data/conda-env/man/man1/bzcmp.1",
      "offset": 498104320,
      "size": 18,
      "type": "reg",
      "start_span": 117,
      "end_span": 117
    },
    {
      "filename": "app/var/data/conda-env/man/man1/bzmore.1",
      "offset": 498105344,
      "size": 4310,
      "type": "reg",
      "start_span": 117,
      "end_span": 117
    }
  ]
}
  • Manual test -
sudo soci convert --all-platforms xx/functions/user-function:0.2.0 xx/functions/user-function:0.2.0-soci

sudo nerdctl push --all-platforms xx/functions/user-function:0.2.0

-- clean all containerd and soci cache

lima sudo bash -c 'systemctl stop soci-snapshotter-grpc && cp /Volumes/git/prafulg/soci-snapshotter/out/soci-snapshotter-grpc /usr/local/bin/soci-snapshotter-grpc && cp /Volumes/git/prafulg/soci-snapshotter/out/soci /usr/local/bin/soci && systemctl start soci-snapshotter-grpc && sleep 2 && systemctl is-active soci-snapshotter-grpc'


lima sudo nerdctl run --net host --rm --platform linux/amd64 --snapshotter soci --entrypoint sh xxxx/functions/user-function:0.2.0-soci -c "/app/var/data/conda-env/bin/python3.12 -c \"import sys; print(sys.version)\""
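
The ztoc summary printed in the testing output above can also be checked mechanically. The following Go sketch parses JSON of that shape and flags the symptom from #1834 (every entry in span 0 despite large offsets). The struct and function names are hypothetical, the field names are taken from the JSON shown above, and the "offset at least twice span_size" threshold is an assumed heuristic, since span boundaries do not fall exactly on multiples of span_size.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ztocFile and ztocInfo mirror only the JSON fields used by the check.
type ztocFile struct {
	Filename  string `json:"filename"`
	Offset    int64  `json:"offset"`
	StartSpan int    `json:"start_span"`
	EndSpan   int    `json:"end_span"`
}

type ztocInfo struct {
	SpanSize int64      `json:"span_size"`
	NumSpans int        `json:"num_spans"`
	Files    []ztocFile `json:"files"`
}

// spanProblems returns a list of findings; an empty list means nothing
// suspicious was detected by these (assumed, heuristic) checks.
func spanProblems(info ztocInfo) []string {
	var problems []string
	maxEnd := 0
	var maxOffset int64
	for _, f := range info.Files {
		if f.StartSpan > f.EndSpan {
			problems = append(problems, fmt.Sprintf("%s: start_span %d > end_span %d", f.Filename, f.StartSpan, f.EndSpan))
		}
		if f.EndSpan >= info.NumSpans {
			problems = append(problems, fmt.Sprintf("%s: end_span %d out of range (num_spans %d)", f.Filename, f.EndSpan, info.NumSpans))
		}
		if f.EndSpan > maxEnd {
			maxEnd = f.EndSpan
		}
		if f.Offset > maxOffset {
			maxOffset = f.Offset
		}
	}
	// Heuristic for the #1834 symptom: data far beyond one span, yet no
	// entry ever leaves span 0.
	if maxEnd == 0 && info.SpanSize > 0 && maxOffset >= 2*info.SpanSize {
		problems = append(problems, "every entry is in span 0 although offsets exceed span_size")
	}
	return problems
}

// parseAndCheck decodes a ztoc summary and runs the checks on it.
func parseAndCheck(data []byte) ([]string, error) {
	var info ztocInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return nil, err
	}
	return spanProblems(info), nil
}

func main() {
	// Two entries copied from the output above.
	sample := []byte(`{
	  "span_size": 4194304,
	  "num_spans": 118,
	  "files": [
	    {"filename": "app/var/data/conda-env/x86_64-conda-linux-gnu/bin", "offset": 1024, "start_span": 0, "end_span": 0},
	    {"filename": "app/var/data/conda-env/man/man1/bzmore.1", "offset": 498105344, "start_span": 117, "end_span": 117}
	  ]
	}`)
	problems, err := parseAndCheck(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("problems found:", len(problems))
}
```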

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@prafgup prafgup marked this pull request as ready for review February 23, 2026 16:53
@prafgup prafgup requested a review from a team as a code owner February 23, 2026 16:53
Comment thread ztoc/ztoc_test.go
@@ -228,6 +228,134 @@ func TestDecompressWithGzipHeaders(t *testing.T) {
}
Contributor

@Shubhranshu153 Shubhranshu153 Feb 23, 2026

Needs an e2e test using pigz.

Contributor Author

@prafgup prafgup Feb 24, 2026

I'll try changing TestDecompressConcatenatedGzip to compress with pigz (invoked through os/exec) instead of using the current BuildConcatenatedGzipTar method.
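
As a rough idea of what shelling out to pigz from Go could look like, here is a minimal sketch. It is not the code in this PR: `compressWithPigz`, `roundTripsThroughPigz`, and `errNoPigz` are hypothetical names, and the sketch assumes only that a `pigz` binary, if installed, compresses stdin to stdout when run with `-c`. In a real test the missing-binary case would more likely call `t.Skip`.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"os/exec"
)

// errNoPigz signals that the pigz binary is not available on this machine.
var errNoPigz = errors.New("pigz not found in PATH")

// compressWithPigz pipes data through `pigz -c` and returns the gzip output.
func compressWithPigz(data []byte) ([]byte, error) {
	path, err := exec.LookPath("pigz")
	if err != nil {
		return nil, errNoPigz
	}
	cmd := exec.Command(path, "-c")
	cmd.Stdin = bytes.NewReader(data)
	var out, stderr bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("pigz failed: %w: %s", err, stderr.String())
	}
	return out.Bytes(), nil
}

// roundTripsThroughPigz compresses with pigz, decompresses with Go's gzip
// reader, and reports whether the original bytes come back unchanged.
func roundTripsThroughPigz(data []byte) (bool, error) {
	gz, err := compressWithPigz(data)
	if err != nil {
		return false, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(gz))
	if err != nil {
		return false, err
	}
	got, err := io.ReadAll(zr)
	if err != nil {
		return false, err
	}
	return bytes.Equal(got, data), nil
}

func main() {
	ok, err := roundTripsThroughPigz(bytes.Repeat([]byte("soci ztoc test payload\n"), 1000))
	if errors.Is(err, errNoPigz) {
		fmt.Println("pigz not installed; skipping")
		return
	}
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", ok)
}
```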

Contributor

Perhaps we can add an integration test that runs a container from an image with real pigz-compressed layers. In some places we may already be using images from ECR Public, which we can populate with the test images we need.

Contributor Author

I think I can get a minimal integration test working: it creates a basic Docker image locally with pigz compression, then uses it to verify that ztoc get-file works and that the FUSE mounts are present.

I think that should be enough for now; in the future we can add a custom ECR image to the soci-workshop-examples repository. That would probably need to be done by a maintainer with access :)

I will update the PR soon.

Contributor Author

@prafgup prafgup Feb 24, 2026

@Shubhranshu153 I have updated the unit test to use pigz for compression and added a minimal integration test that manually generates a pigz-based image, indexes it, and verifies its contents.

I have also manually run and verified the image that was producing a ztoc with every entry in span 0 in #1834.

Contributor

We can create an image with pigz in soci-workshop-example and have a test use it.
It's OK to leave feature extensions for a future PR, but ideally a new feature should ship with the tests it requires.

Comment thread ztoc/compression/gzip_zinfo.c
Comment thread ztoc/compression/gzip_zinfo.c Outdated
@Shubhranshu153
Contributor

Overall this looks correct to me. I'm reading more on gzip to see if I am missing something. I would like some e2e tests here, though, since we don't know whether it works correctly through both the creation and execution lifecycle with a pigz compressor.

@prafgup prafgup force-pushed the prafulg/multiple-gzip-members branch from 7d69a74 to 680c963 Compare February 24, 2026 11:48
@github-actions github-actions Bot added go Pull requests that update Go code testing Unit and/or integration tests labels Feb 24, 2026
Signed-off-by: Praful Gupta <prafulgupta6@gmail.com>
@prafgup prafgup force-pushed the prafulg/multiple-gzip-members branch from 680c963 to 8ce5f10 Compare February 24, 2026 11:52
@prafgup prafgup mentioned this pull request Feb 25, 2026
@prafgup
Contributor Author

prafgup commented Mar 18, 2026

Hey @Shubhranshu153, just a nudge on this :) Is there anything else I should help with here? 😄

@sondavidb
Contributor

I'll take a look myself as well. Sorry this slipped under my radar.

Labels

go Pull requests that update Go code testing Unit and/or integration tests

Development

Successfully merging this pull request may close these issues.

[Bug] All index in ztoc have start_span and end_span 0 even with offset reaching 250mb+

3 participants